Abstract: The objective of the NASA Information And Data System (NAIADS)
project is to develop a prototype of a conceptually new middleware
framework to modernize and significantly improve the efficiency of Earth
Science data fusion, big data processing, and analytics. The key
components of NAIADS include: a Service Oriented Architecture (SOA)
multi-lingual framework, a multi-sensor coincident data Predictor, fast
into-memory data Staging, a multi-sensor data-Event Builder, complete
data-Event streaming (a workflow with minimized IO), and on-line data
processing control and analytics services. The NAIADS project leverages
the CLARA framework, developed at Jefferson Lab and integrated with the
ZeroMQ messaging library. The science services are prototyped and
incorporated into the system. Merging of SCIAMACHY Level-1 observations,
MODIS/Terra Level-2 (Clouds and Aerosols) data products, and ECMWF
re-analysis will be used for NAIADS demonstration and performance tests
in compute Cloud and Cluster environments.

Keywords: earth science, data fusion, framework, event builder

I. Introduction 

One of the key elements of advancing our understanding of Earth’s weather
and climate via remote sensing is the integration of diverse measurements
into the observing system. As remote measurements capture larger volumes
of higher-quality data, meeting the demand for advanced data applications
and high-performance information processing systems becomes a greater
challenge. These challenges are outlined in the OSTP Guidelines for Civil
Space Observations (2013), recognized in the NASA Strategic Space
Technology Investment Plan (2013), and addressed in the NASA Strategic
Objective 2.2 and its implementation by “... developing new technologies
and predictive capabilities, and demonstrating innovative and practical
uses of the program’s data and results for societal benefit” (2014).
The concept of maximizing information content by combining coincident
multi-sensor data and enabling advanced science algorithms was
successfully used by several past and on-going projects: the CERES
experiment [1] for deriving accurate radiation fluxes; fusion of the
CERES, MODIS and MISR observations for estimating instantaneous shortwave
flux uncertainties, and multi-instrument calibration comparison [2, 3];
fusion of MODIS and PARASOL observations to enhance cloud and aerosol
retrievals; and fusion of data from CALIPSO, CloudSat, CERES, and MODIS
(A-Train constellation) for comprehensive aerosol and cloud information
[4]. These advanced science algorithms made it possible to reduce
uncertainty in weather and climate parameters. Future satellite
constellations and NASA missions (RBI, TEMPO, CLARREO, ACE, and GEO-CAPE)
will require tools for efficient data fusion and process scaling.

In response to these challenges, we are developing the NASA Information
And Data System (NAIADS), a prototype framework for next-generation Earth
Science multi-sensor data fusion and processing. The goal of NAIADS is to
provide a novel approach that significantly improves efficiency in Earth
Science multi-sensor big data processing and analysis by deploying a
conceptually new workflow and state-of-the-art software technologies.

II. CLARA Data Streaming Framework 

NAIADS is integrated with CLARA, a Service Oriented Architecture (SOA)
framework [5] developed at the Thomas Jefferson National Accelerator
Facility (Jefferson Lab), and with the 0MQ socket library [6, 7]. The
CLARA framework is designed with a service-oriented architecture to
enhance the efficiency, agility, and productivity of data processing
tasks [8]. A data processing application developed using the CLARA
framework consists of chained services, which are loosely coupled and can
participate in multiple algorithmic compositions. CLARA makes a clear
separation between the service programmer and the data processing
application designer. An application designer can be productive by
designing and composing data processing applications from available,
efficiently and professionally written software services without knowing
the technical details of service programming. Services are usually
long-lived and are maintained and operated by their owners in the
distributed CLARA software. This approach gives an application designer
the flexibility to modify data processing applications by incorporating
different services in order to find optimal operational conditions, which
demonstrates the overall agility of the CLARA framework approach.

This framework was designed based on a specific set of principles. As
mentioned above, the fundamental unit of CLARA-based data processing
application logic is the service. Services exist as independent software
programs with a common interface defined by the framework. User classes
that encapsulate specific algorithms and comply with the required
interface can be presented as CLARA services (the CLARA
Software-as-a-Service, SaaS, implementation). Each service has its own
set of data processing functionalities. These functionalities or
capabilities, suitable for invocation by other services, can be
discovered via registration information available from the CLARA platform
registry. One of the service design recommendations is to keep the
service code base small and simple, which helps future programmers to
extend, modify, maintain and port services. Services must be agnostic to
any external data processing logic. Services must be discoverable and
able to take part in complex service compositions. Standardizing
communication between services makes it easier to adapt a data processing
application to changes in one of its components, and it simplifies data
transfer security (for example by deploying a specialized access control
service).
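
The service principles above (a common interface, discovery through a
registry, and composition by chaining) can be illustrated with a minimal
sketch. This is not the CLARA API: the names Service, Registry, run_chain,
Scale and Total are hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class Service(ABC):
    """Hypothetical common interface that every service implements."""

    name: str = "service"
    description: str = ""

    @abstractmethod
    def execute(self, data: Any) -> Any:
        """Process one input and return the result."""


class Registry:
    """Toy stand-in for a platform registry used for service discovery."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}

    def register(self, service: Service) -> None:
        self._services[service.name] = service

    def discover(self, name: str) -> Service:
        return self._services[name]

    def describe(self) -> Dict[str, str]:
        return {n: s.description for n, s in self._services.items()}


class Scale(Service):
    name = "scale"
    description = "multiply every value by 2"

    def execute(self, data: List[float]) -> List[float]:
        return [2 * x for x in data]


class Total(Service):
    name = "total"
    description = "sum all values"

    def execute(self, data: List[float]) -> float:
        return sum(data)


def run_chain(registry: Registry, chain: List[str], data: Any) -> Any:
    """Compose services by name; the services know nothing of each other."""
    for name in chain:
        data = registry.discover(name).execute(data)
    return data


registry = Registry()
registry.register(Scale())
registry.register(Total())
result = run_chain(registry, ["scale", "total"], [1.0, 2.0, 3.0])
```

In this sketch the application designer only edits the list of service
names passed to run_chain, while each service stays small and unaware of
the surrounding application logic.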
Figure 1: CLARA 3-layer architecture.

The CLARA architecture consists of three layers, as shown in Figure 1. The
first layer is the xMsg Service Bus, which provides the 0MQ-based
publish-subscribe messaging system. Every service or component from the
orchestration layer communicates via this bus, which acts as a messaging
tunnel between services. Such an approach has the advantage of reducing
the number of point-to-point connections between services required to
allow them to communicate in the distributed CLARA computing environment.
xMsg is a messaging system built upon the 0MQ socket library [6], and it
can scale to tens of thousands of processes if needed. It implements
communication patterns such as topic pub-sub, workload distribution, and
request-response. The service layer houses the inventory of services used
to build data processing applications. The Administrative & Registration
service stores information about every registered service in the service
layer, including address, description and operational details. The
orchestration of data analysis applications is accomplished with the help
of an application controller, resident in the orchestration layer of the
CLARA architecture.
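
The topic pub-sub pattern mentioned above can be sketched with a small
in-process bus. This is an illustration only and does not use the xMsg or
0MQ APIs: the ServiceBus class and the topic strings are hypothetical. The
one behaviour borrowed from 0MQ is that a subscription matches every topic
that starts with the subscribed prefix.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Callback = Callable[[str, bytes], None]


class ServiceBus:
    """In-process stand-in for a publish-subscribe bus (not the xMsg API)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic_prefix: str, callback: Callback) -> None:
        """Register a callback for every topic starting with topic_prefix."""
        self._subscribers[topic_prefix].append(callback)

    def publish(self, topic: str, payload: bytes) -> int:
        """Deliver payload to matching subscribers; return delivery count."""
        delivered = 0
        for prefix, callbacks in self._subscribers.items():
            if topic.startswith(prefix):
                for callback in callbacks:
                    callback(topic, payload)
                    delivered += 1
        return delivered


bus = ServiceBus()
received: List[Tuple[str, bytes]] = []
bus.subscribe("naiads:science", lambda t, p: received.append((t, p)))

hits = bus.publish("naiads:science:resample", b"record-1")  # matches prefix
misses = bus.publish("naiads:reader", b"record-2")          # no subscriber
```

Because publishers address topics rather than peers, adding another
subscriber does not require any new point-to-point connection.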
Figure 2: CLARA Service Bus performance: capability of transporting up to
~513 MB/sec within a single node.

The benchmark measurements (Figure 2) were performed on an Intel 2.3 GHz
i7 CPU, utilizing a single core. The results show that CLARA’s xMsg
messaging is capable of transporting 360 MByte/sec of data between
processes/services within a single node. The NAIADS data processing
test-case implementations suggest that data processing latency is
expected to be many orders of magnitude larger than CLARA/xMsg data
transfer latencies, so the CLARA framework overhead would be negligible.

III. NAIADS Architecture 

The NAIADS design is based on the implementation of specific
algorithms/functionalities required for Earth Science data tasks, and
their integration with the CLARA framework. The NAIADS architecture and
workflow are shown in Figure 3. Blue and grey rectangles represent the
framework’s Data Processing Environment (DPE) units, services are shown
as circles, and blue arrows represent transient data flow. Data staging
(SS), reading and pre-sorting (RS), concentrating (CS) and data-Event
building (EB) are performed on dedicated nodes (within the red dashed
line); data-Event-based streaming and processing on the Cloud/Cluster
with minimized IO is indicated within the green dashed line.

Figure 3: NAIADS architecture and workflow: data staging (SS), in-memory
reading (RS), data concentrating (CS), and data-Event building (EB) are
performed on dedicated nodes (within red dashed line) for IO and
networking optimization. Data-Event processing is scaled on Cloud/Cluster
(within green dashed line).

The overall workflow supports multi-stream data fusion by mapping input
files into virtual memory from servers with optimal IO access to files.
The raw data is then locally filtered and sorted into an in-memory data
queue based on fusion parameters. Records from these queues undergo one
or two levels of concatenation to produce the final data-Events. In the
case where all processing takes place over a local high-speed network,
the first level of concatenation is unnecessary. In the case where data
is fused from multiple remote locations (Data Center 1, 2, etc.), it is
advantageous to perform the final concatenation at the site that provides
the optimal volume/concentration of pre-sorted data. Completed
data-Events are stored in a queue and can then be consumed by separate
science processing workflows.

Data Predictor Service (PS): The PS is a service for predicting the time,
location and geometry of near-coincident data for given sensors. This
service involves orbital simulation of the spacecraft location using
Simplified General Perturbations (SGP4), and modeling of the instrument
data acquisition mode (e.g. cross-track operation from LEO or scanning
from a GEO platform).

Orchestrator: Deploys and configures services for user-defined data
processing, monitoring, exception handling and recovery processes. The
Orchestrator builds applications based on available services, and it
designs and controls the data flow by linking all services together.

Data Staging Service (SS): The SS maps the initial data file into virtual
memory; the data size is defined by the user (Orchestrator) at the
service configuration stage. The SS is the only NAIADS system component
that performs multi-file IO.
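
Mapping a file into virtual memory, as the Staging Service is described
as doing, can be sketched with Python's standard mmap module. This is a
minimal illustration, not the NAIADS implementation: the stage_file helper
and the small temporary file standing in for an input data file are
assumptions.

```python
import mmap
import os
import tempfile


def stage_file(path: str) -> mmap.mmap:
    """Map a whole file read-only into virtual memory."""
    with open(path, "rb") as handle:
        # Length 0 maps the entire file; the mapping outlives the handle.
        return mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ)


# Build a small stand-in input file (real inputs would be data files).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"HEADER" + bytes(range(10)))
    demo_path = tmp.name

staged = stage_file(demo_path)
header = staged[:6]        # slicing reads straight from the mapping
payload = staged[6:]
staged_size = len(staged)
staged.close()
os.remove(demo_path)
```

Downstream readers can then slice the mapping like a byte buffer, and the
operating system pages data in on demand instead of the service issuing
explicit reads.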

Data Reader Service (RS): Once data is staged, the RS starts a worker
thread and a sender thread. The worker thread reads the input buffer,
filters and sorts records based on the required parameters, and fills the
pre-sorted data queue. When the pre-sorted data queue is filled up to a
defined watermark, the sender thread starts sending records to the Data
Concentrator Service (CS).
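
The worker/sender arrangement with a watermark can be sketched as follows.
This is an illustrative assumption of how such a reader might be
structured, not the NAIADS code: the run_reader function, the record
layout, and the use of a heap as the pre-sorted queue are hypothetical,
and "sending" is replaced by appending to a list.

```python
import heapq
import threading
from typing import Callable, Iterable, List, Tuple

Record = Tuple[float, str]  # (sort key, payload); layout is hypothetical


def run_reader(records: Iterable[Record],
               keep: Callable[[Record], bool],
               watermark: int) -> List[Record]:
    queue: List[Record] = []   # heap acting as the pre-sorted data queue
    sent: List[Record] = []    # stand-in for records sent to the CS
    cond = threading.Condition()
    done = False

    def worker() -> None:
        nonlocal done
        for record in records:
            if keep(record):                       # filter step
                with cond:
                    heapq.heappush(queue, record)  # keeps the queue ordered
                    if len(queue) >= watermark:
                        cond.notify()              # wake the sender
        with cond:
            done = True                            # no more input
            cond.notify()

    def sender() -> None:
        while True:
            with cond:
                cond.wait_for(lambda: len(queue) >= watermark or done)
                # Drain everything currently queued, smallest key first.
                batch = [heapq.heappop(queue) for _ in range(len(queue))]
                finished = done
            sent.extend(batch)
            if finished:
                return

    threads = [threading.Thread(target=worker),
               threading.Thread(target=sender)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sent


raw = [(3.0, "c"), (1.0, "a"), (9.0, "x"), (2.0, "b")]
sent_records = run_reader(raw, keep=lambda r: r[0] < 5.0, watermark=2)
```

Each drained batch is ordered, but the order across batches depends on
thread timing, so a downstream stage would still merge by key.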

Data Concentrator Service (CS): Performs partial event building by
concatenating the records found at a specific data center. This service
is used to reduce the volume of communication between data centers and to
optimize the network load.

Data Event Builder (EB): The final stage of complete data-Event building,
which generically combines data records from multiple data sources. Each
data-Event represents a self-sufficient data object for the algorithms
defined by the user (Orchestrator) for the entire application.
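
Combining records from several sources into self-sufficient data-Events
can be sketched as a grouping step on a shared coincidence key. The
build_events function, the integer key, the field names and the sample
values below are all hypothetical; the actual NAIADS matching criteria are
not specified here.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Iterable, List, Tuple

Record = Tuple[int, dict]  # (coincidence key, fields from one source)


def build_events(sources: Dict[str, Iterable[Record]]) -> List[dict]:
    """Group records by key and keep events that have every source."""
    grouped: DefaultDict[int, Dict[str, dict]] = defaultdict(dict)
    for source_name, records in sources.items():
        for key, fields in records:
            grouped[key][source_name] = fields

    events = []
    for key in sorted(grouped):
        parts = grouped[key]
        if len(parts) == len(sources):  # complete events only
            events.append({"key": key, **parts})
    return events


sources = {
    "sciamachy": [(1, {"radiance": 0.31}), (2, {"radiance": 0.27})],
    "igbp": [(1, {"surface_index": 1}), (3, {"surface_index": 7})],
}
events = build_events(sources)
```

Only key 1 appears in both sources, so a single complete event is built;
records without a partner are dropped in this sketch.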

Science Services (S): Science algorithms are applied to the fused
data-Events, which are streamed to all available nodes and multi-threaded
across all available cores in each node.

Data Streaming to Cloud/Cluster: Process scaling is achieved by streaming
data-Events to a multi-CPU system, a computing Cloud or a traditional
Cluster.
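
Scaling a science algorithm over the cores of one node can be sketched
with a worker pool from the Python standard library. The
science_algorithm and process_events names and the event layout are
hypothetical. A thread pool is used to keep the sketch self-contained; for
CPU-bound pure-Python code a process pool would be the usual substitute.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional


def science_algorithm(event: dict) -> dict:
    """Toy per-event computation: the mean of the event's values."""
    values = event["values"]
    return {"key": event["key"], "mean": sum(values) / len(values)}


def process_events(events: List[dict],
                   workers: Optional[int] = None) -> List[dict]:
    """Apply the algorithm to independent events with one worker per core."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input events.
        return list(pool.map(science_algorithm, events))


events = [{"key": k, "values": [k, k + 2]} for k in range(4)]
results = process_events(events)
```

Since each data-Event is self-sufficient, events can be handed to workers
independently without shared state.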



Figure 4: The NAIADS workflow on one node: multi-instrument observation
data are streamed from the Event Builder (EB), science algorithms (SI)
are multi-core scaled, data statistics are produced (SS) and can be
returned to the user’s on-line interface. Output results are persisted
(WS).

An example of the NAIADS workflow on one node is shown in Figure 4. The
data IO and event building are decoupled from process scaling on the
compute Cloud, and are controlled by dedicated Orchestrators. The Event
Builder (EB) fills the in-memory data-Event queue, the Reader Service
(RS) sends a data envelope to a Science algorithm (SI), which passes the
processed data to the Writer Service (WS), which stores the science
output in a specified way. The SI services are automatically scaled
within each node of the compute Cloud. This example also includes passing
the output of the Science algorithm to the Statistical Service (SS), and
then streaming results (e.g. histograms) into an in-memory statistics
queue, which can be displayed via the Front End (FE) web user interface
or stored on disk.

The NAIADS transient message format (NcTransient) has been enhanced and
more thoroughly integrated into the NAIADS prototype. NcTransient is
based on the experimental NetCDF ncstream library, which defines an
on-the-wire format for Common Data Model (CDM) datasets. We started with
ncstream, fixed bugs that prevented it from working with our datasets,
and optimized it to handle our primary use case: fast serialization and
de-serialization of CDM datasets to and from CLARA message payloads.

The NcTransient protocol was enhanced to be both 
write and read optimized. For read optimization, 
we only de-serialize variable data if the client actu- 
ally reads it. We also store large variables in multi- 
ple “chunks” which may further limit de-serialization. 
For write optimization, NcTransient allows datasets 
or variables to be marked as read-only. This enables 
us to replace slow serialization with a fast byte array 
copy from existing de-serialization buffers. 
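
The two optimizations described above, decoding a variable only when it is
read and reusing the existing wire bytes for read-only data, can be
sketched as follows. This is not the NcTransient or ncstream format: the
LazyVariable class and the wire layout (little-endian 64-bit floats) are
assumptions made for illustration.

```python
import struct
from typing import List, Optional


class LazyVariable:
    """Variable that keeps its wire bytes and decodes them only on demand."""

    def __init__(self, wire: bytes, read_only: bool = False) -> None:
        self._wire: Optional[bytes] = wire
        self._values: Optional[List[float]] = None
        self.read_only = read_only
        self.decode_count = 0  # how many times bytes were actually decoded

    def read(self) -> List[float]:
        if self._values is None:
            count = len(self._wire) // 8
            self._values = list(struct.unpack(f"<{count}d", self._wire))
            self.decode_count += 1
        return list(self._values)

    def write(self, values: List[float]) -> None:
        if self.read_only:
            raise PermissionError("variable is marked read-only")
        self._values = list(values)
        self._wire = None  # cached wire bytes are now stale

    def serialize(self) -> bytes:
        if self._wire is None:
            # Slow path: re-encode the modified values.
            self._wire = struct.pack(f"<{len(self._values)}d", *self._values)
        # Fast path: hand back the buffer that arrived on the wire.
        return self._wire


wire = struct.pack("<3d", 1.0, 2.0, 3.0)

passthrough = LazyVariable(wire, read_only=True)
forwarded = passthrough.serialize()      # forwarded without any decoding
decodes_before_read = passthrough.decode_count
values = passthrough.read()              # first read triggers one decode
passthrough.read()                       # second read uses the cache

editable = LazyVariable(wire)
editable.write([4.0])
reencoded = editable.serialize()
```

A service that only forwards a dataset therefore never pays the decoding
cost, and a service that reads one variable decodes only that variable.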

IV. Data Test Case 

Data fusion of SCIAMACHY Level-1 observations, MODIS/Terra Level-2
(Cloud, Aerosol, and Land) data products, and ECMWF re-analysis data is
used for NAIADS demonstration and performance tests in compute Cloud and
Cluster environments.



Figure 5: The Level-1 SCIAMACHY hyperspectral data for test case 1: size
and 5 geolocation points of the near-nadir SCIAMACHY footprint (left),
and SCIAMACHY-derived spectral reflectance over evergreen forest and its
standard deviation (right).

Essential features of the SCIAMACHY Level-1 nadir spectral data are
illustrated in Figure 5:
o Nine years of near-nadir measurements;
o From a 10 AM Sun-synchronous orbit;
o Swath 950 km (4 footprints in cross track);
o Footprint size 30 km by 230 km;
o 5 geo-location points per footprint;
o SCIAMACHY Level-1 data volume 2.2 TB.

We implemented test case 1: merging data from the SCIAMACHY Level-1 data
product and the IGBP surface index. The IGBP surface index is auxiliary
information provided on a static geo-grid with (1/6)° resolution. Its
data volume is 7 MB.
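
Looking up a surface index for the geolocation points of a footprint on a
(1/6)° grid can be sketched as below. The grid layout is an assumption
(row 0 at 90° S, column 0 at 180° W, one byte per cell), as are the
grid_index and surface_types helpers and the class value used in the
example; the actual IGBP file layout may differ.

```python
import math
from typing import List, Tuple

CELLS_PER_DEGREE = 6             # (1/6) degree resolution, from the text
ROWS = 180 * CELLS_PER_DEGREE    # 1080 latitude rows (assumed layout)
COLS = 360 * CELLS_PER_DEGREE    # 2160 longitude columns (assumed layout)


def grid_index(lat: float, lon: float) -> Tuple[int, int]:
    """Return (row, col) of the grid cell containing a lat/lon point."""
    row = min(math.floor((lat + 90.0) * CELLS_PER_DEGREE), ROWS - 1)
    col = math.floor((lon + 180.0) * CELLS_PER_DEGREE) % COLS
    return row, col


def surface_types(grid: bytearray,
                  points: List[Tuple[float, float]]) -> List[int]:
    """Look up the surface index for each geolocation point."""
    indices = (grid_index(lat, lon) for lat, lon in points)
    return [grid[row * COLS + col] for row, col in indices]


grid = bytearray(ROWS * COLS)    # toy grid, every cell set to 0
row, col = grid_index(-3.0, -60.0)
grid[row * COLS + col] = 2       # arbitrary class value for one cell

footprint_points = [(-3.0, -60.0), (45.0, 10.0)]
labels = surface_types(grid, footprint_points)
```

Because the grid is static and small, a lookup of this kind is a constant
time index computation per geolocation point.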

V. NAIADS Performance Tests 

We have performed extensive tests using the Amazon Web Services (AWS)
Cloud. The NAIADS AWS Cloud configurations included up to 16 compute
nodes with 32 cores (c4.8xlarge compute optimized instances) and data
stored in an AWS S3 bucket. The test data was staged on each node’s local
SSD storage.


A. Python Implementation: We began our hands-on evaluation with pCLARA,
the Python implementation of the CLARA framework. Python is not designed
to be an aggressively performant language, and initial evaluations showed
that pCLARA with the standard CPython interpreter lagged far behind the
performance of a traditional C++ implementation. In an effort to match
the performance of the traditional C++ implementation, we surveyed
various higher-performance Python dialects. Cython compiles Python code
to C++ modules that are callable directly from Python, which results in a
performance boost of about 30%. Alternatively, Cython can be used to
allow Python to call external C++ code more or less directly, which makes
the Cython solution as performant as the C++ code that it is calling,
with the caveat that overhead is added whenever Python data types are
converted to C++ data types and back. PyPy, a just-in-time compiling
interpreter, was also evaluated and showed very encouraging performance,
but it was not evaluated with pCLARA because it does not appear to
support all of the necessary libraries at this time. The Cython and
Cython-wrapping-C++ implementations were ported back to pCLARA and
benchmarked.
Figure 6: NAIADS performance with pCLARA easily outperforms a traditional
Python solution and can be coerced to approach the performance of a
traditional C++ solution.


The results, illustrated in Figure 6, show that pCLARA can easily
outperform well-optimized Python, and can come very close to the
performance of C++. It is worth noting that portability is not an issue
with any of the above options, as they will run anywhere that Python and
pCLARA will run, with the only caveat that Cython scripts need to be
compiled natively before being run on different machines. For
performance-driven reasons, we will halt our evaluation of pCLARA as a
computation framework and will reserve its use for code that does not
have strict time constraints, such as asynchronous monitoring and
graphing.


B. Java Implementation: The NAIADS/CLARA system has shown very good
linear scalability. Every node processes a file in isolation from the
others, and the only network communication is control messages between
the orchestrator and the DPEs. Since the orchestrator has been optimized
to communicate with nodes in parallel, the overhead of using many nodes
to process files in parallel is minimal. Figure 7 shows the linear
scalability when using up to 16 nodes to process the same set of files.
Figure 7: NAIADS’ linear scalability when using up to 16 nodes for
processing. Each node has 12 cores.

VI. Conclusions 

We are developing a conceptually novel framework for Earth Science big
data fusion. The NASA Information And Data System (NAIADS) software is
integrated with the CLARA framework and with messaging based on the 0MQ
socket library. The team has implemented the first data test case, based
on SCIAMACHY Level-1 hyperspectral data and the IGBP map. The NAIADS
science algorithms have been implemented as framework services: data
reading, data-Event building, quality control, spectral re-sampling, and
product sorting and persistence on disk storage. The implementation was
tested with a number of Python dialects and Java. The NAIADS transient
data format, NetCDF streaming, and IO services for NetCDF and HDF file
formats were developed. Initial tests demonstrated linear multi-node and
multi-core scaling. The Cloud configurations for performance benchmarking
have been designed and tested, and the process is automated.